SenseDefs: a multilingual corpus of semantically annotated textual definitions



Related resources

PACE Corpus: a multilingual corpus of Polarity-annotated textual data from the domains Automotive and CEllphone

In this paper, we describe a publicly available multilingual evaluation corpus for phrase-level Sentiment Analysis that can be used to evaluate real world applications in an industrial context. This corpus contains data from English and German Internet forums (1000 posts each) focusing on the automotive domain. The major topic of the corpus is connecting and using cellphones to/in cars. The pre...


A Semantically Annotated Swedish Medical Corpus

With the information overload in the life sciences there is an increasing need for annotated corpora, particularly with biological and biomedical entities, which is the driving force for data-driven language processing applications and the empirical approach to language study. Inspired by the work in the GENIA Corpus, which is one of the very few of such corpora, extensively used in the biomedi...


Developing a large semantically annotated corpus

What would be a good method to provide a large collection of semantically annotated texts with formal, deep semantics rather than shallow? We argue that a bootstrapping approach comprising state-of-the-art NLP tools for parsing and semantic interpretation, in combination with a wiki-like interface for collaborative annotation of experts, and a game with a purpose for crowdsourcing, are the star...


YAWN: A Semantically Annotated Wikipedia XML Corpus

The paper presents YAWN, a system to convert the well-known and widely used Wikipedia collection into an XML corpus with semantically rich, self-explaining tags. We introduce algorithms to annotate pages and links with concepts from the WordNet thesaurus. This annotation process exploits categorical information in Wikipedia, which is a high-quality, manually assigned source of information, extr...


A Semantically Annotated Corpus from MEDLINE Abstracts

Automatic information extraction is a key technology to help researchers access the information contained in research papers and to extend databases on substances and biological processes. We aim to build information extraction databases [2] from biochemical papers and their abstracts available from the MEDLINE [3] database. To objectively measure the performance of our systems, we built a corp...



Journal

Journal title: Language Resources and Evaluation

Year: 2018

ISSN: 1574-020X,1574-0218

DOI: 10.1007/s10579-018-9421-3